**Opening First Dataset 
use "/Users/lewiskrashnsky/Documents/Princeton/Chris Referendum Project/Data/BC Merged Data/First Period/2001 General Election.dta"
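*Optional check before merging (a sketch, not part of the original workflow).
*merge m:m pairs rows within each key group by their order in the data, so it is worth
*confirming whether the two key variables uniquely identify rows in this file.
*If they do, a 1:1 merge on the same keys may be the safer choice; that is an assumption to verify, not a confirmed property of these files.
duplicates report EDVA_CODE_F ED_ABBREVIATION
capture noisily isid EDVA_CODE_F ED_ABBREVIATION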

**Merging. Results of merge: 7,774 matches. 2,549 not matched. 
merge m:m EDVA_CODE_F ED_ABBREVIATION using "/Users/lewiskrashnsky/Documents/Princeton/Chris Referendum Project/Data/BC Merged Data/First Period/2005 Referendum Data.dta" 

rename _merge _merge1

*Second merge. Results: 7,975 matches. 3,826 not matched. 
merge m:m EDVA_CODE_F ED_ABBREVIATION using "/Users/lewiskrashnsky/Documents/Princeton/Chris Referendum Project/Data/BC Merged Data/First Period/2005 General Election.dta" 

rename _merge _merge2 
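*Optional check (a sketch): cross-tabulate the two merge indicators to see how many rows
*matched in both merges, in only one, or in neither, before any rows are dropped.
tabulate _merge1 _merge2, missing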


**Dropping absentee, mail-in-ballots, special ballots, voting in DEO office (reason: cannot satisfy assumption that same individuals are voting with this method in multiple years)**
drop if strmatch(EDVA_CODE_F, "*Absentee*")
drop if strmatch(EDVA_CODE_F, "*Voting in DEO office*")
drop if strmatch(EDVA_CODE_F, "*Voting by mail*")
drop if strmatch(EDVA_CODE_F, "*Special*")

*Check duplicates 
duplicates report EDVA_CODE_F
duplicates examples EDVA_CODE_F
drop in 4275

***Checking for odd vote totals between referendum and concurrent election in 2005*** 
gen weird_total = total_party_vote_2005-Total_Ref_Votes_2005
summarize weird_total

gen weird_total_percent = (total_party_vote_2005-Total_Ref_Votes_2005)/total_party_vote_2005
summarize weird_total_percent

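*Flag fully matched rows whose totals differ by 10% or more in either direction (parentheses are needed because & binds tighter than |)
gen check_story = weird_total_percent if _merge1==3 & _merge2==3 & (weird_total_percent>=.10 | weird_total_percent<=-.10)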
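*Optional inspection (a sketch): count and list the flagged rows by their identifying codes,
*so that any later manual fix can be traced to a polling area rather than to a row number alone.
count if !missing(check_story)
list EDVA_CODE_F ED_ABBREVIATION total_party_vote_2005 Total_Ref_Votes_2005 weird_total_percent if !missing(check_story)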

**Cleaning data to account for weird totals between 2005 referendum and election
*Row 229: corrected referendum counts (151 Yes + 159 No = 310 total); rejected ballots blanked
replace Yes_Ref_Vote_2005 = 151 in 229
replace No_Ref_Vote_2005 = 159 in 229
replace Total_Ref_Votes_2005 = 310 in 229
replace Rejected_Ballots_Ref_2005 = "" in 229
replace check_story = . in 229
replace weird_total_percent = . in 229

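**Dropping the weird total observations that cannot be explained (44 observations out of roughly 10,000, about 0.4%, i.e. less than 1%)
*Row numbers are positional: each drop shifts the rows below it up, so these commands must be run in this exact order on the same data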
drop in 208 
drop in 729
drop in 897
drop in 389
drop in 391
drop in 397
drop in 414
drop in 633
drop in 664
drop in 904
drop in 1464
drop in 2049
drop in 3756
drop in 4752
drop in 1295
drop in 4073/4074
drop in 6367/6368
drop in 6108
drop in 6165
drop in 6183
drop in 7000
drop in 7027/7028
drop in 7335/7336
drop in 7387/7388
drop in 7815
drop in 8323/8324
drop in 8345
drop in 8366
drop in 8376
drop in 8400
drop in 8632
drop in 8726
drop in 9448
drop in 9515
drop in 9800


*Destringing other party variables (force sets any non-numeric entries to missing)
destring NDP_Vote_2001 Green_Party_Vote_2001 UPBC_Vote_2001 BCM_Vote_2001 ///
	FREE_Vote_2001 PF_Vote_2001 ANP_Vote_2001 POC_Vote_2001 IND_Vote_2001 ///
	CP_Vote_2001 RP_Vote_2001 OTHER_Vote_2001, replace force

destring NDP_Vote_2005 Green_Party_Vote_2005 Marijuana_Party_Vote_2005 ///
	Freedom_Party_Vote_2005 Independent_Vote_2005 DR_BC_Vote_2005 ///
	Platinum_Party_Vote_2005 Work_Less_Party_Votes_2005 Conservative_Vote_2005 ///
	BC_Refed_Vote_2005 Communist_Party_Vote_2005 Libertarian_Votes_2005 ///
	Peoples_Front_Vote_2005 Other_Vote_2005, replace force


***Cleaning data: checking if vote totals are accurate
**Checking if vote totals line up with party vote totals in 2001 
egen total_party_vote_2001 = rowtotal(Liberal_Party_Vote_2001 NDP_Vote_2001 Green_Party_Vote_2001 UPBC_Vote_2001 BCM_Vote_2001 FREE_Vote_2001 PF_Vote_2001 ANP_Vote_2001 POC_Vote_2001 IND_Vote_2001 CP_Vote_2001 RP_Vote_2001 OTHER_Vote_2001)

gen party_total_vote_difference_2001 = total_party_vote_2001-Total_Valid_Votes_2001

gen problematic_total_2001 = party_total_vote_difference_2001 if party_total_vote_difference_2001 != 0

summarize problematic_total_2001
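*Optional inspection (a sketch): list the rows where the party votes do not add up to the reported total.
*Note that egen's rowtotal() treats missing party votes as zero, so a row with missing party
*entries can show up here even when the reported total is correct.
list EDVA_CODE_F ED_ABBREVIATION total_party_vote_2001 Total_Valid_Votes_2001 if !missing(problematic_total_2001)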

*Making last necessary change to fix totals in 2001 (row 5696)
replace Total_Valid_Votes_2001 = 390 in 5696
replace Rejected_Ballots_2001 = "1" in 5696
replace Registered_Voters_2001 = "" in 5696
replace problematic_total_2001 = . in 5696

*Dropping created variables for checking vote totals in 2001 
drop total_party_vote_2001
drop party_total_vote_difference_2001
drop problematic_total_2001

***Vote Totals were manually constructed in 2005 using party votes provided so not necessary to check totals of downloaded dataset*** 

***Checking if 2005 Referendum Vote totals line up 
egen check_ref_vote_total = rowtotal(Yes_Ref_Vote_2005 No_Ref_Vote_2005)

gen Ref_2005_vote_diff = check_ref_vote_total-Total_Ref_Votes_2005

gen problematic_total_2005_ref = Ref_2005_vote_diff if Ref_2005_vote_diff != 0

*summarize: 2 problematic vote totals 
summarize problematic_total_2005_ref

*Evident problem: totals were shifted one column to the right (incredibly high rejected vote number). Fixed manually
replace Yes_Ref_Vote_2005 = 566 in 231
replace No_Ref_Vote_2005 = 717 in 231
replace Total_Ref_Votes_2005 = 1283 in 231
replace Rejected_Ballots_Ref_2005 = "41" in 231
replace Registered_Votes_Ref_2005 = . in 231

replace Yes_Ref_Vote_2005 = 498 in 232
replace No_Ref_Vote_2005 = 401 in 232
replace Total_Ref_Votes_2005 = 899 in 232
replace Rejected_Ballots_Ref_2005 = "20" in 232
replace Registered_Votes_Ref_2005 = . in 232


*Dropping created variables for checking vote total in 2005 referendum 
drop problematic_total_2005_ref
drop Ref_2005_vote_diff 
drop check_ref_vote_total


*Ensuring Percentage variables are clean (one real change made)
replace Liberal_Vote_Percentage_2001 = . if Liberal_Vote_Percentage_2001>1
replace Liberal_Vote_Percentage_2001 = . if Liberal_Vote_Percentage_2001<0

replace Percent_Liberal_Vote_2005 = . if Percent_Liberal_Vote_2005>1
replace Percent_Liberal_Vote_2005 = . if Percent_Liberal_Vote_2005<0

replace Yes_Ref_Vote_Percent_2005 = . if Yes_Ref_Vote_Percent_2005>1
replace Yes_Ref_Vote_Percent_2005 = . if Yes_Ref_Vote_Percent_2005<0
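*Optional safeguard (a sketch): assert stops the do-file if any share falls outside [0, 1].
*After the replacements above it should pass; it mainly guards against later edits to the data.
foreach v of varlist Liberal_Vote_Percentage_2001 Percent_Liberal_Vote_2005 Yes_Ref_Vote_Percent_2005 {
	assert inrange(`v', 0, 1) | missing(`v')
}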


**Checking for Outliers
scatter Liberal_Vote_Percentage_2001 Percent_Liberal_Vote_2005
scatter Total_Valid_Votes_2001 total_party_vote_2005
scatter total_party_vote_2005 Total_Ref_Votes_2005


**Creating new variables for analysis
**Dependent Variable 
gen Percent_NO_Ref_Vote_2005 = No_Ref_Vote_2005/Total_Ref_Votes_2005
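*Optional consistency check (a sketch): if Yes_Ref_Vote_Percent_2005 was computed on the same
*denominator (Total_Ref_Votes_2005), the Yes and No shares should sum to 1. That denominator is
*an assumption; this file does not show how the Yes share was built. ref_share_gap is a temporary helper.
gen double ref_share_gap = Yes_Ref_Vote_Percent_2005 + Percent_NO_Ref_Vote_2005 - 1
summarize ref_share_gap
count if abs(ref_share_gap) > .001 & !missing(ref_share_gap)
drop ref_share_gap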

*NDP Vote Percentage
gen Percent_NDP_Vote_2001 = NDP_Vote_2001/Total_Valid_Votes_2001
gen Percent_NDP_Vote_2005 = NDP_Vote_2005/total_party_vote_2005


*Green Party Vote Percentage 
gen Percent_Green_Vote_2001 = Green_Party_Vote_2001/Total_Valid_Votes_2001
gen Percent_Green_Vote_2005 = Green_Party_Vote_2005/total_party_vote_2005


***Analysis***

*Scatter plots: Liberal Voting 
graph twoway (lfit Yes_Ref_Vote_Percent_2005 Liberal_Vote_Percentage_2001) (scatter Yes_Ref_Vote_Percent_2005 Liberal_Vote_Percentage_2001)


graph twoway (lfit Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005) (scatter  Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005)


graph twoway (lfit Percent_NO_Ref_Vote_2005 Liberal_Vote_Percentage_2001) (scatter Percent_NO_Ref_Vote_2005 Liberal_Vote_Percentage_2001)


graph twoway (lfit Percent_NO_Ref_Vote_2005 Percent_Liberal_Vote_2005) (scatter  Percent_NO_Ref_Vote_2005 Percent_Liberal_Vote_2005)



*Scatter plots: NDP Voting 
graph twoway (lfit Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2001) (scatter Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2001)

graph twoway (lfit Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2005) (scatter Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2005)


graph twoway (lfit Percent_NO_Ref_Vote_2005 Percent_NDP_Vote_2001) (scatter Percent_NO_Ref_Vote_2005 Percent_NDP_Vote_2001)


graph twoway (lfit Percent_NO_Ref_Vote_2005 Percent_NDP_Vote_2005) (scatter Percent_NO_Ref_Vote_2005 Percent_NDP_Vote_2005)



*Scatter plots: Green Party Voting 	
graph twoway (lfit Yes_Ref_Vote_Percent_2005 Percent_Green_Vote_2001) (scatter Yes_Ref_Vote_Percent_2005 Percent_Green_Vote_2001)


graph twoway (lfit Yes_Ref_Vote_Percent_2005 Percent_Green_Vote_2005) (scatter Yes_Ref_Vote_Percent_2005 Percent_Green_Vote_2005)

graph twoway (lfit Percent_NO_Ref_Vote_2005 Percent_Green_Vote_2001) (scatter Percent_NO_Ref_Vote_2005 Percent_Green_Vote_2001)


graph twoway (lfit Percent_NO_Ref_Vote_2005 Percent_Green_Vote_2005) (scatter Percent_NO_Ref_Vote_2005 Percent_Green_Vote_2005)
	
	
**OLS Models
*Liberal Vote models 
regress Yes_Ref_Vote_Percent_2005 Liberal_Vote_Percentage_2001, robust 
outreg2 using Yes_vote_liberal_2001.doc, replace ctitle (Yes_Vote_Percentage_2005 (OLS))

regress Yes_Ref_Vote_Percent_2005 Liberal_Vote_Percentage_2001 Total_Valid_Votes_2001, robust 

regress Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005, robust
outreg2 using Yes_vote_liberal_2005.doc, replace ctitle (Yes_Vote_Percentage_2005 (OLS))

regress Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005 total_party_vote_2005, robust

regress Percent_NO_Ref_Vote_2005 Percent_Liberal_Vote_2005, robust 
outreg2 using No_vote_liberal_2005.doc, replace ctitle (No_Vote_Percentage_2005 (OLS))


*NDP Vote models 
regress Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2001, robust 
outreg2 using Yes_vote_ndp_2001.doc, replace ctitle (Yes_Vote_Percentage_2005 (OLS))

regress Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2005, robust 
outreg2 using Yes_vote_ndp_2005.doc, replace ctitle (Yes_Vote_Percentage_2005 (OLS))


regress Percent_NO_Ref_Vote_2005 Percent_NDP_Vote_2005, robust 
outreg2 using No_vote_ndp_2005.doc, replace ctitle (Percent Opposition to Electoral Reform, 2005 Referendum (OLS))


***Checking on residuals with high NDP vote and more varied referendum vote***
*Missing NDP shares are excluded explicitly because Stata treats missing as larger than any number
regress Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2005 if Percent_NDP_Vote_2005>.75 & !missing(Percent_NDP_Vote_2005), robust 

scatter Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2005 if Percent_NDP_Vote_2005>.75 & !missing(Percent_NDP_Vote_2005)

*Mean Yes share above successive NDP-share cutoffs: the mean does not rise at higher tiers of NDP percent, but it stays consistently around 60% voting Yes, so the relationship still holds
foreach cutoff in .5 .6 .7 .8 {
	mean Yes_Ref_Vote_Percent_2005 if Percent_NDP_Vote_2005>`cutoff' & !missing(Percent_NDP_Vote_2005)
}


**Checking on EDs that were big NDP victories: means are consistently over 50% for referendum voting
*Missing NDP shares are excluded explicitly because Stata treats missing as larger than any number
local ndp_eds NEL WKB CLR YAL MRP SRG SRN SRP SWH NEW POR VHA VKE VKI VMP ALQ CWL NAN ESM MJF
foreach cutoff in .5 .6 {
	foreach ed of local ndp_eds {
		mean Yes_Ref_Vote_Percent_2005 if Percent_NDP_Vote_2005>`cutoff' & !missing(Percent_NDP_Vote_2005) & ED_ABBREVIATION == "`ed'"
	}
}




***Average support among Liberal Vote subsets** 

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.2 & Percent_Liberal_Vote_2005<=.3 

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.3 & Percent_Liberal_Vote_2005<=.4 

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.2 & Percent_Liberal_Vote_2005<=.4 

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.4 & Percent_Liberal_Vote_2005<=.5 

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.5 & Percent_Liberal_Vote_2005<=.6

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.4 & Percent_Liberal_Vote_2005<=.6

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.6 & Percent_Liberal_Vote_2005<=.7

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.7 & Percent_Liberal_Vote_2005<=.8

mean Percent_NO_Ref_Vote_2005 if Percent_Liberal_Vote_2005>=.6 & Percent_Liberal_Vote_2005<=.8


mean Yes_Ref_Vote_Percent_2005 if Percent_Liberal_Vote_2005>=.2 & Percent_Liberal_Vote_2005<=.3 

mean Yes_Ref_Vote_Percent_2005 if Percent_Liberal_Vote_2005>=.4 & Percent_Liberal_Vote_2005<=.5 

mean Yes_Ref_Vote_Percent_2005 if Percent_Liberal_Vote_2005>=.7 & Percent_Liberal_Vote_2005<=.8 


**Average support among NDP vote subsets** 
mean Percent_NO_Ref_Vote_2005 if Percent_NDP_Vote_2005>=.2 & Percent_NDP_Vote_2005<=.4 

mean Percent_NO_Ref_Vote_2005 if Percent_NDP_Vote_2005>=.2 & Percent_NDP_Vote_2005<=.3 

mean Percent_NO_Ref_Vote_2005 if Percent_NDP_Vote_2005>=.4 & Percent_NDP_Vote_2005<=.5 

mean Percent_NO_Ref_Vote_2005 if Percent_NDP_Vote_2005>=.7 & Percent_NDP_Vote_2005<=.8




**Figures with Lowess fit** 
lowess Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005

lowess Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2005

**Trying inverse-normal (probit) transformation of the vote shares; shares of exactly 0 or 1 become missing**
gen normal_Ref_percent_vote = invnormal(Yes_Ref_Vote_Percent_2005)

gen normal_Liberal_2005_percent_vote = invnormal(Percent_Liberal_Vote_2005)

gen normal_NDP_2005_vote = invnormal(Percent_NDP_Vote_2005)
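*Optional check (a sketch): invnormal() returns missing at exactly 0 or 1, so count how many
*observations each transformed variable loses at those boundaries.
foreach v of varlist Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005 Percent_NDP_Vote_2005 {
	count if inlist(`v', 0, 1)
}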

*Figures of normal transformations of data 
lowess normal_Ref_percent_vote normal_Liberal_2005_percent_vote

lowess normal_Ref_percent_vote normal_NDP_2005_vote


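***Multivariate Regression models
*non_main is the residual share for all other parties, so the four shares sum to 1 and the first model is fit without a constant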
gen non_main = 1 - (Percent_Liberal_Vote_2005 + Percent_NDP_Vote_2005 + Percent_Green_Vote_2005)
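*Optional check (a sketch): the residual share should lie between 0 and 1. A clearly negative
*value would mean the three party shares sum to more than 1 for that row. The small tolerance
*is an arbitrary choice to ignore floating-point rounding.
summarize non_main
count if non_main < -.000001 & !missing(non_main)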

regress Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005 Percent_NDP_Vote_2005 Percent_Green_Vote_2005 non_main, nocons robust 
outreg2 using Multivariate_2005_Models.doc, replace ctitle (Percentage Support for Electoral Reform, 2005 Referendum (OLS))

regress Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005 Percent_NDP_Vote_2005 Percent_Green_Vote_2005, robust 


regress Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005 Percent_NDP_Vote_2005, robust 


regress Yes_Ref_Vote_Percent_2005 Percent_Liberal_Vote_2005 Percent_Green_Vote_2005, robust
outreg2 using Multivariate_2005_Models_2.doc, replace ctitle (Percentage Support for Electoral Reform, 2005 Referendum (OLS))

regress Yes_Ref_Vote_Percent_2005 Percent_NDP_Vote_2005 Percent_Green_Vote_2005, robust
outreg2 using Multivariate_2005_Models_3.doc, replace ctitle (Percentage Support for Electoral Reform, 2005 Referendum (OLS))



**Final model 
gen cons_2005 = Conservative_Vote_2005/total_party_vote_2005
replace cons_2005 = 0 if cons_2005==.

gen Lib_Con_2005_frac = (cons_2005 + Percent_Liberal_Vote_2005)
sum Lib_Con_2005_frac

*Unweighted
regress Yes_Ref_Vote_Percent_2005 Lib_Con_2005_frac, robust 
outreg2 using Supplementary_2005_Regression_Unweighted.doc, replace ctitle (Percentage Support for Electoral Reform, 2005 Referendum (OLS))

*Weighted
regress Yes_Ref_Vote_Percent_2005 Lib_Con_2005_frac [pweight= total_party_vote_2005], robust 
outreg2 using Supplementary_2005_Regression_Weighted.doc, replace ctitle (Percentage Support for Electoral Reform, 2005 Referendum (OLS))
